Evaluating iterative optimization across 1000 datasets
Similar resources
Hyperparameter Importance Across Datasets
With the advent of automated machine learning, automated hyperparameter optimization methods are by now routinely used. However, this progress is not yet matched by equal progress on automatic analyses that yield information beyond performance-optimizing hyperparameter settings. In this work, we aim to answer the following two questions: Given an algorithm, what are generally its most important...
Estimating effect size across datasets
Most NLP tools are applied to text that is different from the kind of text they were evaluated on. Common evaluation practice prescribes significance testing across data points in available test data, but typically we only have a single test sample. This short paper argues that in order to assess the robustness of NLP tools we need to evaluate them on diverse samples, and we consider the proble...
Evaluating Iterative Compilation
This paper describes a platform independent optimisation approach based on feedback-directed program restructuring. We have developed two strategies that search the optimisation space by means of profiling to find the best possible program variant. These strategies have no a priori knowledge of the target machine and can be run on any platform. In this paper our approach is evaluated on three f...
Evaluating SPARQL Queries on Massive RDF Datasets
Distributed RDF systems partition data across multiple computer nodes. Partitioning is typically based on heuristics that minimize inter-node communication and it is performed in an initial, data pre-processing phase. Therefore, the resulting partitions are static and do not adapt to changes in the query workload; as a result, existing systems are unable to consistently avoid communication for ...
Exploring similarities across high-dimensional datasets
Very often, related data may be collected by a number of sources, which may be unable to share their entire datasets for reasons like confidentiality agreements, dataset size, etc. However, these sources may be willing to share a condensed model of their datasets. If some substructure of the condensed models of such datasets, from different sources are found to be unusually similar, policies su...
Journal
Journal title: ACM SIGPLAN Notices
Year: 2010
ISSN: 0362-1340, 1558-1160
DOI: 10.1145/1809028.1806647